Geometric Clustering Using the Information Bottleneck Method
Abstract
We argue that K-means and deterministic annealing algorithms for geometric clustering can be derived from the more general Information Bottleneck approach. If we cluster the identities of data points to preserve information about their location, the set of optimal solutions is massively degenerate. But if we treat the equations that define the optimal solution as an iterative algorithm, then a set of “smooth” initial conditions selects solutions with the desired geometrical properties. In addition to conceptual unification, we argue that this approach can be more efficient and robust than classic algorithms.
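The idea of iterating the self-consistent Information Bottleneck equations on point identities can be illustrated with a small sketch. This is not the authors' implementation. It assumes one-dimensional points, a uniform prior over point identities, and one possible reading of "smooth": each point's location distribution p(x|i) is a Gaussian bump of assumed width `s`, evaluated on the data points themselves. The function names, `beta`, `s`, and the random soft initialization are all hypothetical choices made for illustration.

```python
import math
import random


def smooth_location_dists(points, s):
    """Row i is p(x|i): a Gaussian bump of width s (an assumption) around point i."""
    rows = []
    for xi in points:
        w = [math.exp(-((xi - xj) ** 2) / (2.0 * s * s)) for xj in points]
        z = sum(w)
        rows.append([v / z for v in w])
    return rows


def kl(p, q, eps=1e-300):
    """Kullback-Leibler divergence KL(p || q) in nats, with q floored at eps."""
    return sum(pk * math.log(pk / max(qk, eps)) for pk, qk in zip(p, q) if pk > 0.0)


def ib_cluster(points, n_clusters=2, beta=5.0, s=0.5, n_iter=50, seed=0):
    """Iterate the standard IB self-consistent equations; return hard labels."""
    rng = random.Random(seed)
    n = len(points)
    p_x_i = smooth_location_dists(points, s)

    # Random soft initial assignments p(c|i), kept away from 0 and 1.
    p_c_i = []
    for _ in range(n):
        w = [rng.random() + 0.5 for _ in range(n_clusters)]
        z = sum(w)
        p_c_i.append([v / z for v in w])

    for _ in range(n_iter):
        # p(c) = sum_i p(i) p(c|i), with p(i) = 1/n.
        p_c = [sum(p_c_i[i][c] for i in range(n)) / n for c in range(n_clusters)]

        # p(x|c) = sum_i p(x|i) p(c|i) p(i) / p(c).
        p_x_c = []
        for c in range(n_clusters):
            denom = max(p_c[c] * n, 1e-300)
            p_x_c.append(
                [sum(p_x_i[i][j] * p_c_i[i][c] for i in range(n)) / denom
                 for j in range(n)]
            )

        # p(c|i) proportional to p(c) * exp(-beta * KL(p(x|i) || p(x|c))),
        # computed in the log domain to avoid underflow.
        for i in range(n):
            logits = [
                math.log(max(p_c[c], 1e-300)) - beta * kl(p_x_i[i], p_x_c[c])
                for c in range(n_clusters)
            ]
            m = max(logits)
            w = [math.exp(v - m) for v in logits]
            z = sum(w)
            p_c_i[i] = [v / z for v in w]

    # Hard label for each point: the cluster with the largest p(c|i).
    return [max(range(n_clusters), key=lambda c: row[c]) for row in p_c_i]


labels = ib_cluster([0.0, 0.1, 0.2, 5.0, 5.1, 5.2])
```

In this toy setting the two well-separated groups of points should end up with different labels. Which group receives label 0 depends on the random initialization, so only the grouping is meaningful.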
Similar resources
Data Clustering by Markovian Relaxation and the Information Bottleneck Method
We introduce a new, non-parametric and principled, distance-based clustering method. This method combines a pairwise-based approach with a vector-quantization method, which provides a meaningful interpretation of the resulting clusters. The idea is based on turning the distance matrix into a Markov process and then examining the decay of mutual information during the relaxation of this process. The...
The information bottleneck and geometric clustering
The information bottleneck (IB) approach to clustering takes a joint distribution P(X, Y) and maps the data X to cluster labels T which retain maximal information about Y (Tishby et al., 1999). This objective results in an algorithm that clusters data points based upon the similarity of their conditional distributions P(Y | X). This is in contrast to classic “geometric clustering” algorithms ...
Unsupervised Image Clustering Using the Information Bottleneck Method
A new method for unsupervised image category clustering is presented, based on a continuous version of a recently introduced information theoretic principle, the information bottleneck (IB). The clustering method is based on hierarchical grouping: Utilizing a Gaussian mixture model, each image in a given archive is first represented as a set of coherent regions in a selected feature space. Imag...
Registration-Based Segmentation Using the Information Bottleneck Method
We present two new clustering algorithms for medical image segmentation based on multimodal image registration and the information bottleneck method. In these algorithms, the histogram bins of two registered multimodal 3D images are clustered by minimizing the loss of mutual information between them. Thus, the clustering of histogram bins is driven by the preservation of the shared informat...
Agglomerative Information Bottleneck
We introduce a novel distributional clustering algorithm that explicitly maximizes the mutual information per cluster between the data and given categories. This algorithm can be considered a bottom-up, hard version of the recently introduced “Information Bottleneck Method”. We relate the mutual information between clusters and categories to the Bayesian classification error, which provides a...